ABSTRACT

Discussion threads form a central part of the experience on many 
Web sites, including social networking sites such as Facebook and 
Google Plus and knowledge creation sites such as Wikipedia. To 
help users manage the challenge of allocating their attention among 
the discussions that are relevant to them, there has been a growing 
need for the algorithmic curation of on-line conversations — the 
development of automated methods to select a subset of discussions 
to present to a user. 

Here we consider two key sub-problems inherent in conversa- 
tional curation: length prediction — predicting the number of com- 
ments a discussion thread will receive — and the novel task of re- 
entry prediction — predicting whether a user who has participated 
in a thread will later contribute another comment to it. The first of 
these sub-problems arises in estimating how interesting a thread is, 
in the sense of generating a lot of conversation; the second can help 
determine whether users should be kept notified of the progress of 
a thread to which they have already contributed. We develop and 
evaluate a range of approaches for these tasks, based on an analy- 
sis of the network structure and arrival pattern among the partici- 
pants, as well as a novel dichotomy in the structure of long threads. 
We find that for both tasks, learning-based approaches using these 
sources of information yield improvements for all the performance 
metrics we used. 

Categories and Subject Descriptors: H.2.8: Data Mining 

General Terms: Measurement; Experimentation; Theory 

Keywords: user-generated content, comment threads, threads,
Facebook, Wikipedia, conversations, likes, feed ranking,
recommendation, on-line communities, social networks, discussions

1. INTRODUCTION 

Many Web sites are organized around a continuously evolving 
set of discussion threads. This style of interaction is a key compo- 
nent of on-line groups and message boards, social networking sites 
such as Facebook and Google Plus, and the workflow of collabora- 
tive projects such as Wikipedia and open-source development. In 
all these cases, a user must continuously decide how to allocate his 
or her attention to a range of relevant discussions, and this can be a
challenging task when content arrives at a rapid rate. 

A growing number of sites are helping users address this chal- 
lenge through the algorithmic curation of discussion threads, auto- 
matically selecting which threads to bring to a user's attention at 
any given point in time. A canonical example is Facebook's News 
Feed — for users with a sufficient number of active friends on the 
site, an unfiltered stream of all stories generated by friends is gener- 
ally much less valuable to the user than a ranked and filtered version 
of the stream that attempts to highlight the stories estimated to be 
most engaging to the user. 

The problem of curating discussion threads is thus a wide-ranging 
one in the context of applications, but it is one which for the most 
part has not been systematized in prior research. Our goal in this 
paper is to facilitate such a systematization, by identifying and for- 
malizing two important sub-problems in conversational curation, 
and then developing and evaluating techniques to address them. 
For our evaluation, we use discussion threads from two sites where 
such threads form a core part of the experience: discussions among 
users on Facebook and discussions among editors on Wikipedia. 
As on many other sites, threads on Facebook and Wikipedia can 
be conceptualized as an initial post and a subsequent sequence of 
comments; we will use this terminology in what follows. 

The present work: Two problems in conversational curation.

We now describe the two problems that we study, together with 
their motivation as components of conversational curation. 

1. Length prediction: given the initial portion of a thread (a post 
and the first few comments following it), how well can we 
predict the eventual length of the thread? We use this length 
prediction problem as a concretely formulated proxy for the 
general issue of estimating the level of interest a thread will 
generate, based on observation of its early stages. 

2. Re-entry prediction: given the initial portion of a thread and 
the identity of one of the commenters, how well can we pre- 
dict whether this commenter will contribute another com- 
ment later in the thread? This is a key issue in determining 
whether to keep a user notified of the progress on a thread 
once he or she has contributed to it — some threads have the 
structure of a conversation where users are motivated to re- 
turn repeatedly, while others involve each user contributing 
once (for example, to offer congratulations or condolences) 
but then not returning. 

Taken together, these two problems cover a set of central issues
in conversational curation: identifying threads that will generate
sustained interest, so as to be able to highlight them to users, and
recognizing whether a thread is something that a contributing user
will want to continue to follow as it evolves.

We develop techniques for these problems by first analyzing the 
structure of threads, and then formulating a set of properties that 
we in turn use for the prediction tasks. 

We begin by investigating the following issue, on data described
in §2. Intuitively, one feels from experience that there are two
distinct types of long threads: those that become long because a small
group of people engage in an extensive conversation via the com- 
ments, and those that become long because many users each con- 
tribute a single comment. A canonical example of the latter would 
begin with a post in which a user announces a major life event, 
and then many friends contribute congratulations in the comments 
as in a wedding guestbook. We refer to the first type of thread as 
focused, and the second type as expansionary. 

But is this notion of two types simply one's perception of two
extremes of a broad distribution, or is there quantitative evidence for
it? We find (§3) in fact that threads genuinely exhibit this two-type
effect: for long threads, the distribution of the number of distinct 
commenters is bimodal, with threads either dominated by a very 
small number of distinct users, or by a sequence of commenters 
who generally do not return to the thread after commenting once. 
In addition to providing what is, to our knowledge, the first evi- 
dence for this basic dichotomy, this finding helps reinforce the im- 
portance of our second problem — re-entry prediction — by estab- 
lishing that active discussion threads can vary considerably in the 
extent to which participants are interested in returning after their 
initial contribution. 

In order to build a framework for approaching our two basic
problems, we begin by studying (§4) a range of related thread
properties. One of the most useful of these is the thread's arrival
pattern: the ordering by which new entrants into the thread are
interleaved with returning participants. Formalizing this notion allows
us to work with relaxed versions of the two extremes of focused and 
expansionary threads discussed above, and to explore the region 
that interpolates between them. We also study network and tempo- 
ral structure: whether the first few commenters are linked within 
a social network, and how quickly after the post do they arrive in 
real-time; both convey information about the future trajectory of 
the thread. 

We incorporate these properties into a machine learning approach
for predicting length (§5) and re-entry (§6). Evaluating the
prediction performance enables us to identify the features that are most
effective for our two problems. At a high level, we find that the 
structure of the arrival pattern is the most useful for re-entry pre- 
diction, while temporal properties together with the arrival pattern 
give the strongest performance for length prediction. 

Next, in a later section, we explore a probabilistic model of
participant re-entry related to the dichotomy between focused threads
and expansionary ones. Clearly some styles of post tend to lead to one type of
thread or the other, but for other kinds of posts, one sees both types 
of threads emerge; for example, the same shared link to a news 
story can generate a focused thread when it is shared among one set 
of users and an expansionary thread when it is shared among a
different set. It is therefore natural to ask whether a type of
symmetry-breaking can arise directly from the dynamics of a discussion
itself; that is, whether there is a simple probabilistic generative model
capable of producing both focused and expansionary threads over 
different realizations of its random trajectory. We show how to con- 
struct such a model from plausible assumptions about turn-taking 
and new entrants in discussion threads; the model exposes inter- 
esting connections between discussion threads and nonlinear urn 
processes. 



In a later section we review related work on the dynamics of on-line
discussions. For now, we note that the general issue of thread length
has been studied, using different techniques, in contexts distinct from
ours, primarily for comments on blog and news sites, where
essentially all threads are expansionary, with many participants who
typically contribute only once or very few times each [14, 23, 25, 26].
In contrast, our approach incorporating the notion that there
can be multiple structurally distinct types of long threads is suited 
to settings where the participants maintain long-running relation- 
ships with one another. These structural distinctions also provide a 
core part of the motivation for re-entry prediction, which is a key 
issue for organizing conversations in these settings; the problem of 
re-entry prediction has not, to our knowledge, been formulated or 
studied previously. 

2. DATA AND BASIC DEFINITIONS 

We use data from Facebook and Wikipedia to construct three dis- 
tinct populations of users whose discussion threads we study. We 
choose Facebook as perhaps the most well-known example of a 
post-plus-comments interface for socially-oriented conversations. 
Conversations among Wikipedia editors form a contrasting case 
that has also received research attention [7, 10, 17]: the
discussions are task-oriented, as opposed to socially-oriented, and there
is no formal structure imposed on conversations by the interface; 
nonetheless, they can still be naturally treated as instances of com- 
ment threads. 

For completeness, we briefly describe the structure of these dis- 
cussion threads at a general level. On Facebook, we study instances 
in which a user posts a status update, and then other users with per- 
mission to comment on the status update contribute comments to 
it. On Wikipedia, editors interact on talk-pages to discuss issues 
concerning articles, projects or Wikipedia policies. Each editor has 
the option of hosting a talk-page, and most active users do. 

On both Facebook and Wikipedia, we will refer to the status up- 
date or initiating text as the post; the sequence of comments that 
follows the post will be called the comment thread, and the post 
together with all the comments will be called the full thread. The 
poster together with the commenters in a full thread will be called 
the full thread's participants. The number of items in the thread 
(including the post in the case of a full thread, but not in a com- 
ment thread) will be called its length or its volume; we use these 
two terms synonymously. 

From Facebook, we first selected 100,000 users uniformly at ran- 
dom from the population of US Facebook users. We will refer to 
this set U in our analysis as the uniform Facebook population. Also, 
out of all US Facebook users who posted between 200 and 300 status
updates over an 80-day period, we randomly selected 100,000
of these heavily engaged users. We will refer to this set A as the 
high-activity Facebook population. For both U and A, we study 
the comment threads associated with all their posts during the same 
80-day period. All Facebook data was used anonymously, and all 
analysis was done in aggregate. 

Our Wikipedia data is derived from the corpus of
Danescu-Niculescu-Mizil et al. [7]. We used 118,447 conversation threads of length at
least 1 (to discard posts made by automated bots, which never at- 
tract responses) which took place asynchronously on the talk-pages 
hosted by 6,555 highly active editors; posts average 2.12 com- 
ments. We also use the content of the talk-pages to assess the exis- 
tence of an interactional link between a given pair of Wikipedia edi- 
tors: we say that two editors are linked if at least one of them added 
a post or comment on the other's editor talk page. Our Wikipedia 
data will be available at
http://www.mpi-sws.org/~cristian/Echoes_of_power.html.

Figure 1: Heat map (best viewed in color) of the density
functions on distinct commenters, uniform Facebook popula- 
tion. For each thread prefix length k, there is a peak in density 
(lighter color) at a small number and a second peak approxi- 
mately at the maximum number of distinct participants d. 

3. FOCUSED VS. EXPANSIONARY THREADS 

If a post leads to a long comment thread, then it is one that at- 
tracts a great deal of attention and so is likely of interest; thus, the 
thread-length prediction problem is crucial to the curating of con- 
versations. In thinking about how to bring long threads to users' at- 
tention, though, a natural question is whether there are sub-classes 
of such conversations that should be treated differently. 

This question leads us to conjecture that there is a dichotomy be- 
tween expansionary high-activity threads, created by the one-time 
actions of many different "drive-by" commenters, versus focused 
high-activity threads, reflecting a high-level of repeated engage- 
ment among relatively few people. In this section, we provide sup- 
porting evidence for this conjecture and discuss its consequences. 

Distinct Participants: Two Local Maxima. To investigate the 
validity of our conjecture, we consider how the number of distinct 
participants in a thread is distributed. To do so, we must account 
for two issues. First, we do not want the idiosyncratic actions of 
any one high-volume user to dominate the quantities involved, so 
we work with a macro-averaged function.¹ Second, the possible
number of distinct participants in a thread depends on the thread's 
length, and so we need to parametrize by it. 

Thus, formally, for a population of users $V$, let $V_k$ be the set of
users who authored at least one post having comment thread length
at least $k$. For each user $u \in V_k$, we take all full threads
associated with a post by $u$ that produced at least $k$ comments, and
we truncate each of these threads to the prefix consisting of just the
post and the first $k$ comments. Let $S_u(k)$ be the average number of
distinct participants in all these prefixes of full threads initiated
by $u$. (For a given such prefix, the number can range from 1, when the
original poster contributed all of the comments as well, to $k + 1$,
when all commenters are distinct and the original poster didn't
comment.) We then define $\Delta_k^V(d)$ to be the fraction of users
$u \in V_k$ for whom $[S_u(k)] = d$, where $[\cdot]$ denotes rounding to
an integer. Note that $\Delta_k^V$ is a density function. In what
follows, for brevity we will sometimes refer to it simply as an average
or an expectation, with the understanding that this refers in fact to a
macro-averaged quantity.
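
The macro-averaged density just defined can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format (a list of `(poster, commenters)` pairs), the function name `distinct_participant_density`, and the round-half-up rule are all hypothetical choices made for the example, not details taken from the analysis itself.

```python
import math
from collections import Counter, defaultdict


def distinct_participant_density(threads, k):
    """Macro-averaged density of distinct participants in length-k prefixes.

    threads: iterable of (poster, commenters) pairs, where commenters is the
    ordered list of comment authors (hypothetical input format).
    Returns a dict mapping d to the fraction of eligible users whose
    average number of distinct participants rounds to d.
    """
    per_user = defaultdict(list)
    for poster, commenters in threads:
        if len(commenters) >= k:
            # Participants of the prefix: the poster plus the first k commenters.
            participants = {poster, *commenters[:k]}
            per_user[poster].append(len(participants))
    if not per_user:
        return {}
    # One value per user (macro-averaging); rounding half up is an assumption.
    rounded = Counter(
        math.floor(sum(counts) / len(counts) + 0.5)
        for counts in per_user.values()
    )
    return {d: n / len(per_user) for d, n in sorted(rounded.items())}
```

Because each user contributes exactly one rounded value, a single prolific poster cannot dominate the resulting density, which is the purpose of macro-averaging here.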

In these terms, our conjecture can be expressed as follows: for
threads of sufficient length $k$, the density function $\Delta_k^V(d)$
should have (at least) two local maxima: one at a small value of $d$,
i.e., $d \ll k$, and one at a large value of $d$, i.e., $d \approx k + 1$.

In Figure 1, we show the family of density functions $\Delta_k^U$ for
$k \in \{1, 2, \ldots, 50\}$ on our uniform Facebook population $U$. The
densities are drawn as a heat map, with column $k$ representing the
density function $\Delta_k^U$. We see that as $k$ increases,
$\Delta_k^U(d)$ is first maximized at $d = 4$, reflecting the dominant
role of the focused effect; but then, a second local maximum emerges at
a value of $d$ very close to $k$. For the population $A$ of
high-activity Facebook users, we see essentially the same effect,
including the two local maxima at $d = 4$ and $d$ close to $k$ (figure
omitted for space).

Although the smaller data volume makes it more difficult to discern
the effect on Wikipedia, when we group together the possible
values of $d$ into contiguous intervals, we find significant evidence
of two local maxima there too. To quantify the effect on Wikipedia,
we compare quantiles of $\Delta_k^V$, defining

$$f_k(p, q) = \sum_{pk < d < qk} \Delta_k^V(d).$$

We find that as $k$ increases (in particular, considering $k > 15$),
the value of $f_k$ on the interval at each extreme (the one starting at
$p = 0$ and the one ending at $q = 1$) exceeds its value on an interior
interval. These inequalities are consistent with Figure 1, where the
density function is larger at the two extremes than in comparably-sized
intervals in between.
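
The interval comparison can be made concrete with a small helper. The sketch below assumes the density is available as a dict from $d$ to its mass and takes the summation range to be the strict inequality $pk < d < qk$; the function name `interval_mass` and the quarter-width intervals in the usage example are illustrative choices, not values taken from the text.

```python
def interval_mass(density, k, p, q):
    """Total density mass on the d-values with p*k < d < q*k.

    density: dict mapping number of distinct participants d to its mass
    (hypothetical representation of the density function for prefix length k).
    """
    return sum(mass for d, mass in density.items() if p * k < d < q * k)


# Toy density for k = 20 with mass concentrated at both extremes.
toy = {2: 0.35, 3: 0.15, 8: 0.05, 12: 0.05, 19: 0.40}
low = interval_mass(toy, 20, 0.0, 0.25)
inner_low = interval_mass(toy, 20, 0.25, 0.5)
inner_high = interval_mass(toy, 20, 0.5, 0.75)
high = interval_mass(toy, 20, 0.75, 1.0)
```

On such a bimodal toy density, `low` and `high` both exceed the two interior masses, which is the shape of inequality the comparison is designed to detect.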

Consequences: Predict Both Length and Re-entry. As argued 
earlier, conversational-curation systems should contain a thread- 
length prediction component. But our new observation about the 
distinction between expansionary and focused threads shows that 
long threads can differ significantly in the extent to which a com- 
menter will want to return to contribute a second time. This het- 
erogeneity in long threads motivates the formulation of our second 
task, re-entry prediction: determining whether a given participant 
in the thread is likely to contribute again. To our knowledge, re- 
entry in on-line conversations is a problem that has not been previ- 
ously formalized or studied. 

Length and re-entry are important, and distinct, issues in the task 
of conversational curation. Length prediction, since it provides in- 
formation about the amount of attention a thread is likely to re- 
ceive, helps in assessing whether a user should be made aware of 
the thread at all. Re-entry prediction, on the other hand, provides 
information about how to keep a user informed of the evolution 
of the thread once he or she has already contributed to it: a high 
re-entry probability indicates that the user may well want to know 
about subsequent comments, so that he or she can contribute in re- 
sponse to them. 

Predicting a particular user's re-entry is different from predicting 
whether the thread itself will be focused or expansionary. While 
very few users re-enter an expansionary thread by definition, it is 
easily possible for a user u to contribute to a thread that later be- 
comes dominated by a back-and-forth discussion among a small 
set of other participants; in this case, the thread is focused, but user 
u's re-entry probability might be low. Predicting re-entry provides
a concrete recommendation with respect to a given user, in a way 
that predicting whether a thread will be focused or expansionary 
does not. 

We note that re-entry prediction is focused on a user's produc- 
tion of comments — specifically, whether the user will write an- 
other comment in the future. An interesting open question is to 
consider the analogous prediction task for a user's consumption of 
comments. In particular, a user might be interested in continuing to 
read comments on a thread as they come in, despite having no in- 
tention of contributing again. (Consider a string of congratulatory 
messages on a life event that include interesting side information, 
such as personal reminiscences or clever quips.) 



¹ The results turn out to be similar for the micro-averaged analog.



4. EARLY PARTICIPANTS: SOCIAL, SEQUENTIAL, AND TEMPORAL STRUCTURES

In this section, we show how properties of the initial participants
in a thread can provide information about the thread's later dynamics,
thus laying the groundwork for the features in our subsequent
prediction experiments. First (§4.1), we show that the presence or
absence of social links among the initial participants in a thread
turns out to provide useful information, though in interestingly
different ways for different settings. Second (§4.2), inspired by our
expansionary vs. focused analysis in §3, which introduces the
importance of re-entry, we develop a novel representation for the
sequence of participant contributions. Third (§4.3), we demonstrate
that how fast the initial commenters arrive provides important
information about the eventual number of comments, though its
connection with re-entry probability is less clear.

4.1 Links Among the Initial Participants 

We first consider comment threads in our Facebook High-Activity
population (outcomes are analogous for the Uniform set), focus- 
ing on threads with at least two comments and where the first two 
commenters are distinct from each other and from the post's au- 
thor. The tension between the focused and the expansionary effects 
has a natural reflection in the relationship between these first two 
commenters. If they are friends, then interest in the post might be 
limited to a particular portion of the poster's social neighborhood. 
That is, interest in such a post could have limited reach, which 
could restrict thread length. At the same time, though, there might 
also be increased potential for an extended conversation to ensue as 
friends interact, which would lead to a longer thread. 

The top-left plot in Figure 2 shows that in fact, these Facebook
threads are significantly longer when the first two commenters are 
friends. 

We can further validate this hypothesized effect of conversational 
interaction by examining a related mechanism in which the role of 
interaction is much more limited. The Facebook "like" feature is 
very useful for this purpose. Users can respond to a post not just 
by commenting on it but also by clicking the like (thumbs-up) but- 
ton, which provides a one-bit endorsement of the content. Thus, 
likes are a light-weight communication alternative to comments, 
and we can consider "like threads" — the sequence of likes arriv- 
ing on a post — as the corresponding analog of comment threads. 
But there is a crucial difference: in like threads, there is no analog 
to the back-and-forth interaction that characterizes conversational 
interaction. 

When the first two likers in a like thread are distinct, how is the 
eventual length of the like thread affected by whether these two 
users are friends? The top-middle plot of Figure 2 shows that,
in the absence of repeated interactions to offset its consequences, a
limited-reach effect is clear: the like thread is shorter when the first
two likers are linked.²

Applying the same analysis to Facebook thread prefixes of length
k = 3, 4, and 5 yields very similar results. For space reasons, we
only depict the case k = 3 (bottom-left and bottom-middle plots
in Figure 2), but the expected length of comment threads continues
to increase almost perfectly monotonically in the number of edges
among the first k commenters when they are all distinct. Also, we
find completely analogous results for re-entry in Facebook threads:
the re-entry of the first participant increases strongly with the
number of edges among the first k commenters when they are all
distinct.

² We note that measuring the number of distinct commenters shows
the same limited-reach effect: although the comment thread is
longer when the first two commenters are linked, the total number
of distinct commenters is smaller.

Figure 2: Thread length vs. number of connections between the
first k commenters when they are distinct. More-connected
commenters make for longer Facebook (FB) comment threads but
shorter "like threads" and Wikipedia (WK) discussion threads.
(Y-axes not aligned to heighten trend visibility.)
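
As a rough illustration of the quantity on the x-axis of Figure 2, the following sketch counts links among the first k commenters. The representation of the social graph as a set of `frozenset` pairs, the function name, and the choice to also require the commenters to differ from the poster for every k are assumptions made for this example.

```python
from itertools import combinations


def edges_among_first_commenters(poster, commenters, friends, k):
    """Number of links among the first k commenters, or None if ineligible.

    friends: set of frozenset({a, b}) pairs (hypothetical graph encoding).
    A thread is eligible only if it has at least k comments and the first k
    commenters are distinct from one another and from the poster.
    """
    first = commenters[:k]
    if len(first) < k or len(set(first)) < k or poster in first:
        return None
    return sum(
        1 for a, b in combinations(first, 2) if frozenset((a, b)) in friends
    )
```

Grouping threads by the returned count and averaging their final lengths within each group would then give one curve of the kind plotted in Figure 2.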

What about Wikipedia comment threads? As depicted in the
rightmost column of Figure 2, in this domain it is not the case
that more connections between the first participants lead to longer
threads, although the available data here is quite sparse. We
conjecture that the root cause for the striking contrast to Facebook may be
the task-oriented nature of the setting, in which conversations may 
be less discursive, and editors who have interacted in the past may 
be more conversationally efficient in reaching a conclusion. 

It is interesting to note that earlier work of Ugander et al.
considered the level of connectedness among the set of users who appear
in an invitation to join Facebook [24]; invitations that displayed
users who were not linked to each other had higher overall conver- 
sion rates than invitations that displayed linked users. While in- 
vitations and comment threads are clearly different in nature, they 
both involve opportunities to engage a user in the activity of the 
site; whether there is a deeper relationship between the connected- 
ness of commenters here and the connectedness of inviters in that 
setting is an interesting open question. 

4.2 Arrival Patterns 

Having looked at the number of distinct commenters, and at the 
graph structure on the first few participants in the case when they 
are distinct, we now develop a general method for representing the 
precise sequence of arrivals of the first few participants, and show 
that these sequences have the potential to be very useful features in 
our prediction tasks. 

For a comment thread $t$, let $t_i$ (for $i = 1, 2, \ldots$) denote the
identity of the author of the $i$-th comment in the thread. We now
define the following encoding $\gamma(t)$ of comment thread $t$.
$\gamma(t)$ is a sequence of non-negative integers; the $i$-th entry in
the sequence, denoted $\gamma(t)_i$, is equal to 0 if $t_i$ is the
author of the post that started the thread (returning to the thread in
this case as the $i$-th commenter), and otherwise $\gamma(t)_i$ is equal
to the value of $j$ such that $t_i$ is the $j$-th distinct commenter
(not counting the original poster) to take part in $t$. In what
follows, we refer to $\gamma(t)_i$ as the ID code of commenter $t_i$.
We will also use the term arrival pattern to refer generically to any
prefix of $\gamma(t)$ (including the full sequence). Figure 3
illustrates these concepts via two sample discussions, exemplifying
that focused threads should have arrival patterns in which some
back-and-forth between two participants is evident, whereas
expansionary threads should have arrival patterns in which all ID codes
occur very few times, mostly just once.
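
The encoding is straightforward to compute in one pass over the comments. The sketch below is a minimal illustration; the function name `arrival_pattern` and the representation of a thread as a poster plus an ordered list of comment authors are assumptions made for the example.

```python
def arrival_pattern(poster, commenters):
    """Encode a comment thread as its sequence of ID codes.

    The original poster always has ID code 0; every other participant gets
    the next unused positive integer the first time they comment.
    """
    ids = {poster: 0}
    pattern = []
    for author in commenters:
        if author not in ids:
            # len(ids) counts the poster too, so the first newcomer gets 1.
            ids[author] = len(ids)
        pattern.append(ids[author])
    return pattern


# The two sample discussions of Figure 3.
focused = arrival_pattern("Mary", ["Mary", "Don", "Pat", "Don", "Pat"])
expansionary = arrival_pattern("James", ["Dina", "Fred", "Mia", "Moe", "James"])
```

Any prefix of the returned list, such as `focused[:2]`, is an arrival pattern in the sense used above.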

Can early (i.e., short) arrival patterns serve as useful features for
our prediction tasks? Before describing our full experiments (detailed
in the next sections), it is useful to show some preliminary
evidence of these patterns' potential utility.

Facebook High-Activity, length-5 arrival patterns

pattern      1 re-enters   % of occ.
1,0,1,0,1    55.2%         19.2
1,0,1,0,0    47.5%          2.8
1,0,1,0,2    26.7%          4.9
1,0,1,2,0    26.1%          4.1
1,0,2,0,2    16.5%          4.6
1,2,0,2,0    14.6%          3.1
1,0,2,3,0    12.5%          1.9
1,2,0,3,0    11.5%          2.0
1,0,2,0,3    10.9%          2.6
1,2,3,4,5     5.6%          3.6
                           sum: 48.8

Wikipedia, length-5 arrival patterns

pattern      1 re-enters   % of occ.
1,0,1,1,0    60.2%          2.4
1,1,0,1,0    58.6%          2.8
1,0,1,0,0    55.3%          4.8
1,0,0,1,0    52.2%          5.5
1,1,1,1,1    47.5%          2.3
1,0,1,2,1    46.0%          3.2
1,0,1,0,2    45.8%          2.7
1,2,1,2,1    41.1%          5.1
1,0,1,0,1    38.8%         27.0
1,2,3,4,5     7.2%          1.7
                           sum: 57.5

Facebook High-Activity, length-9 arrival patterns

pattern bins               1 re-enters   % of occ.
#0:3, #1:6                 67.7%          1.7
#0:4, #1:5                 66.9%         12.1
#0:5, #1:4                 65.5%          4.0
#0:3, #1:4, #2:2           56.8%          2.1
#0:3, #1:3, #2:3           50.9%          1.7
#0:4, #1:4, #2:1           47.6%          5.2
#0:4, #1:3, #2:2           38.5%          3.5
#0:4, #1:3, #2:1, #3:1     28.2%          2.0
#0:4, #1:2, #2:3           22.3%          2.8
#0:4, #1:1, #2:4            9.6%          2.7
                                         sum: 37.8

Table 1: Top and middle: the most common length-5 arrival patterns on
Facebook, accounting for 48.8% of the occurrences of all possible such
arrival patterns, and on Wikipedia, accounting for 57.5% of all
occurrences of all possible such patterns. The patterns are sorted by
the percentage of corresponding threads in which the user with ID code 1
returns to the thread to comment again. "% of occ.": percentage of
threads of length ≥ 5 prefixed by that pattern. Bottom: the same for the
most common length-9 arrival patterns, except that patterns have been
binned by counts of ID codes since there are many possible length-9
patterns. For example, "#0:3, #1:6" denotes the set of length-9 patterns
where ID code 0 occurs 3 times and ID code 1 occurs 6 times, in any
order. (Some populations omitted for brevity or due to data sparseness.)

focused thread

Mary:  Anyone there
Mary:  ?
Don:   me
Pat:   not me
Don:   V funny
Pat:   i know

Length-2 arrival pattern: 0,1
Length-5 arrival pattern: 0,1,2,1,2

expansionary thread

James: we're engaged!
Dina:  Congrats !
Fred:  Congrats !
Mia:   great!!!
Moe:   great!
James: Thanks guys :)

Length-2 arrival pattern: 1,2
Length-5 arrival pattern: 1,2,3,4,0

Figure 3: Example conversations demonstrating our arrival-pattern
coding scheme for the comment portion of threads.

First, we see whether different (early) arrival patterns tend to
correspond to different thread lengths. Figure 4 shows, for each
of our three populations, the (macro-averaged) length of threads
whose length-two prefixes correspond to each of the five possible
length-two patterns; the fact that the mean thread lengths fall in
mostly disjoint confidence intervals indicates that the patterns do
have predictive value.³
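
To see where the five possible length-two patterns come from, and how threads might be grouped by them, consider the following sketch. It is illustrative only: the function names are hypothetical, threads are assumed to be given as full arrival patterns (lists of ID codes), and the mean is a plain per-thread average rather than the macro-averaged quantity used in the analysis.

```python
from collections import defaultdict
from statistics import mean


def valid_patterns(n):
    """All arrival patterns of length n.

    Each entry is either 0 (the poster), an ID code already seen, or the
    next unused positive ID code.
    """
    patterns = [[]]
    for _ in range(n):
        patterns = [
            p + [code]
            for p in patterns
            for code in range(max(p, default=0) + 2)
        ]
    return [tuple(p) for p in patterns]


def mean_length_by_prefix(arrival_patterns, n=2):
    """Mean comment-thread length, grouped by length-n arrival pattern."""
    groups = defaultdict(list)
    for pattern in arrival_patterns:
        if len(pattern) >= n:
            groups[tuple(pattern[:n])].append(len(pattern))
    return {prefix: mean(lengths) for prefix, lengths in groups.items()}
```

For n = 2 the enumeration yields (0,0), (0,1), (1,0), (1,1), and (1,2). Confidence intervals like those in Figure 4 would additionally require a variance estimate for each group.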

Second, we see whether different arrival patterns tend to
correspond to differing re-entry probabilities, focusing on the chance
that the user with ID code 1 (i.e., the first commenter who isn't the
original poster) subsequently re-joins the thread by adding another
comment. Table 1 demonstrates that arrival patterns carry significant
information about ID code 1's re-entry probability. In all the
populations shown, it appears that guestbook-style patterns
containing many distinct ID codes tend to result in noticeably lower
re-entry probabilities. For Facebook, we also see a strong positive
correlation between the number of times ID code 1 appears in an
arrival pattern and the likelihood that ID code 1 will subsequently
appear again.
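
The per-pattern re-entry rates and the count-based bins reported in Table 1 can be sketched as follows. The code assumes threads are given as full arrival patterns and interprets "re-enters" as ID code 1 appearing again after the length-n prefix; both the interpretation and the function names are assumptions made for this illustration.

```python
from collections import Counter, defaultdict


def bin_key(prefix):
    """Label a pattern by how often each ID code occurs, e.g. '#0:3, #1:6'."""
    counts = Counter(prefix)
    return ", ".join(f"#{code}:{counts[code]}" for code in sorted(counts))


def reentry_rate_of_first_commenter(arrival_patterns, n, key=tuple):
    """Fraction of threads, per length-n prefix group, where ID 1 returns.

    key maps a prefix to its group: `tuple` keeps exact patterns,
    `bin_key` groups patterns by ID-code counts.
    """
    totals = defaultdict(int)
    returns = defaultdict(int)
    for pattern in arrival_patterns:
        prefix = pattern[:n]
        if len(prefix) < n or 1 not in prefix:
            continue  # too short, or only the poster has commented so far
        group = key(prefix)
        totals[group] += 1
        if 1 in pattern[n:]:
            returns[group] += 1
    return {group: returns[group] / totals[group] for group in totals}
```

Passing `key=bin_key` with n = 9 reproduces the style of grouping used for the length-9 patterns, where exact sequences are too numerous to tabulate individually.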

³ We note that the two most frequent arrival patterns in all three
populations are (1,0) and (1,2), which is interesting because (1,0)
corresponds to the canonical turn-taking structure in a pairwise
conversation, while (1,2) is the canonical sequence of successive
new arrivals, a further reflection of our focused/expansionary
dichotomy.

Figure 4: In all populations, the 95% confidence intervals for
mean thread length for the five possible length-2 arrival patterns
(indicated as labels on the intervals) are almost all disjoint.
Grey/dashed intervals indicate rare arrival patterns (at most 1% of
threads), so the long interval involved in the single overlap (0,0 in
Facebook Uniform) is for a sparse situation.

4.3 Timing effects

Our analysis thus far has considered the sequence of commenters 
without any information about the speed at which they arrive in real 
time. We now show some basic results establishing that this type of 
temporal structure contains important information about the length 
and re-entry properties of threads; in the next section, we use this 
information as part of our prediction methods. 

In Figure 5 we see (black curve) that the longer it takes for the
first comment to arrive on an initial post, the shorter the thread, 
presumably because "late" first comments correspond to less over- 
all activity around the post. But note that timing isn't everything: 
beyond a certain point, the probability that the first commenter re- 
enters a thread (green curve) becomes approximately independent 
of the first-comment arrival time lag. 

5. PREDICTING THREAD LENGTH 
Figure 5: Black (left axis, top curve): the longer it takes for the
original post to attract its first comment, the lower the expected
thread length. Green (right axis, bottom curve): in contrast,
the probability that the first commenter re-enters the thread
is eventually independent of the first comment's arrival time.
The y = 3 and y = 20% lines are included to allow for visual
comparison of the real data curves with theoretical curves in
which the arrival lag has no effect.



We now engage in the two main prediction tasks of this paper.
Recall that the first task, which we describe in this section, is to
predict thread length, as an indication of how much interest a post
will eventually generate, given the state of the thread at a certain
early point. For example, we ask, given a thread initiated by a Facebook
user posting a status update to their friends that has already
accumulated 5 comments, how well can we predict the final length
of the thread? The results of the second task, to predict whether a
user that has already participated in a thread will later re-enter that
same thread, are described in Section 6.

For thread-length prediction, we formulate two concrete tasks 
on the Wikipedia and Facebook High-Activity datasets (the results
on the Facebook Uniform dataset are similar, but with larger er- 
ror bars). From the set of all posts made by the active Facebook 
users, we selected the subset of posts that received at least 5 com- 
ments, and randomly reserved 50% of them for evaluation, using 
the other 50% for feature exploration. This gave us a test data set 
of 1,996,624 posts. Out of these, we chose a threshold of 8 com- 
ments to create an approximately balanced binary prediction prob- 
lem: given the state of the thread after five posts, will the thread 
eventually receive at least 8 comments? (55.25% of the posts are 
in the positive class.) Similarly, for Wikipedia, we look at all talk- 
page posts that have received at least two comments, and ask, will 
they receive a third one? In this case, our data is smaller, with only 
44,732 items in the test set (54.55% of which are in the positive 
class). 
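The construction of the binary task can be sketched as follows. This is a minimal illustration, assuming only that the final comment count of each thread is known; the function name and the toy thread lengths are hypothetical, and the defaults correspond to the Facebook setting (observe 5 comments, ask about 8). The Wikipedia setting would use `observed=2, threshold=3`.

```python
def build_length_task(thread_lengths, observed=5, threshold=8):
    """Keep threads with at least `observed` comments and label each one
    True if it eventually reached at least `threshold` comments."""
    return [length >= threshold
            for length in thread_lengths
            if length >= observed]


# Toy final thread lengths (made up for illustration).
labels = build_length_task([2, 5, 7, 8, 12, 30])
positive_rate = sum(labels) / len(labels)
print(labels, positive_rate)  # [False, False, True, True, True] 0.6
```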

5.1 Features (used here and in §6)

The features we employed are summarized in Table 2. The first
three sets are based on our discussion above of links between
participants (§4.1), arrival patterns (§4.2), and timing effects (§4.3).
We describe the other two sets now.

Naturally, for this task, we use only features that can be derived
from the state of the thread when it had 5 comments.

LINKS
  edges_prev[i]*       Number of links from commenter to previous commenters
  mutual_poster[i]*    Number of links from commenter to users linked to
                       the original poster
ARRIVAL PATTERNS
  id_code[i]           Commenter ID code as described in §4.2
  uniq_comm[i]         Unique commenters through comment i
TIME
  time[i]              Time taken for the first i comments to arrive
TEXT REGRESSION FEATURES
  Orig_post_terms      "comment", "agree", etc.: see §5.1
MISC
  num_words[i]         Number of words in comment i
  num_chars[i]         Number of characters in comment i
  question[i]*         Comment i has a '?'
  exclaim[i]*          Comment i has a '!'
  likes[i]*            Num likes on original post before comment i is made
  comment_likes[i]*    Num likes on comments before comment i

Table 2: Features used in our prediction experiments. For each
indexed feature, we also build a comparable feature for the
original post when it makes sense (the id_code for the original
post is always 0 and so is omitted but, for example, the length
of the original post in words or characters is meaningful).
Features marked with * were applied only for Facebook data.

An important question is whether the textual features of the original
post are more or less effective for this task than the non-textual
features we have already described. To investigate this issue, we
elected to gather a small, presumably general set of such "Original
post terms" features via text regression, which has previously
been employed for blog comment-volume prediction [26]. Specifically,
we used J. M. White's TextRegression R package, which
employs linear regression with elastic-net regularization [9], run
on a set of posts disjoint from the training and test data used for
classification. 50 terms were selected for the Facebook data;
among them were "comment" and "anybody" (positive coefficient
for thread length), and "re-post" and URLs (negative coefficient).
Among the 30 selected terms for Wikipedia were "agree" (positive)
and "thank" (negative).

Also, preliminary pilot studies revealed a set of fairly intuitive 
miscellaneous features, listed in the last section of Table 2, that are
potentially correlated with thread length. For instance, one might 
expect that on average, posts containing a question mark pose ques- 
tions that prompt comments as responses. 

5.2 Performance Results 

Our testing methodology was: for a given set of features and 
train/test set, create bagged decision trees with 60 trees trained on 
independent samples of the training data; then, apply the bagged 
decision trees on the disjoint test set. 

Our main method was to use all the features described in Table 2.
We compared its performance against the following two baselines. 
The positive-percentage bias baseline chooses an item's label ran- 
domly with bias equal to the percentage of test items in the positive 
class (55.52% in the Facebook case, 54.55% in the Wikipedia case). 
The text-regression baseline uses only the Orig_post_terms features
chosen via text regression as described in §5.1.

The performance of our method versus the two baselines is shown
in Table 3. Clearly, the combined use of participant-link,
arrival-pattern, timing, and other information yields the best results for all
five of our performance metrics. The small set of text-regression
features extracted from the original post sometimes did worse than
the positive-percentage bias baseline.

                               ACC    AUC    RMSE   APR    CXE
FB     Pos.-% bias baseline    .552   .500   .497   .550   .992
       Text baseline           .537   .529   .503   .568   1.01
       All our features        .672   .729   .457   .758   .872
Wiki   Pos.-% bias baseline    .548   .500   .498   .549   .993
       Text baseline           .488   .505   .517   .550   1.06
       All our features        .595   .627   .486   .661   .958

Table 3: Main thread-length prediction results. Bold = best
performance per dataset, under various metrics: ACC: accuracy
(for FB active: after 5 comments, predicting whether the
thread achieves length ≥ 8; for Wiki: after 2 comments,
predicting whether an additional comment will occur). AUC: area
under the ROC curve. RMSE: root mean square error. APR:
mean average precision. CXE: cross-entropy.
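For readers who want to reproduce this kind of comparison, the sketch below gives standard textbook definitions of four of the metrics for a classifier that outputs a probability of the positive class. The exact definitions used in our experiments are not spelled out here, so the 0.5 decision threshold, the base-2 logarithm in the cross-entropy, and the pairwise formulation of AUC are assumptions; mean average precision is omitted.

```python
import math


def accuracy(labels, probs):
    """Fraction of items whose thresholded prediction (p >= 0.5) is correct."""
    hits = sum((p >= 0.5) == bool(y) for y, p in zip(labels, probs))
    return hits / len(labels)


def rmse(labels, probs):
    """Root mean square error between 0/1 labels and predicted probabilities."""
    return math.sqrt(sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels))


def cross_entropy(labels, probs, eps=1e-12):
    """Mean negative log-likelihood in bits (base 2 is an assumption)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total -= math.log2(p) if y else math.log2(1 - p)
    return total / len(labels)


def auc(labels, probs):
    """Probability that a random positive is scored above a random negative,
    counting ties as one half."""
    pos = [p for y, p in zip(labels, probs) if y]
    neg = [p for y, p in zip(labels, probs) if not y]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predictions (made up for illustration).
labels = [1, 1, 0, 0]
probs = [0.9, 0.4, 0.6, 0.1]
print(accuracy(labels, probs), auc(labels, probs))  # 0.5 0.75
```

A predictor that outputs the same probability for every item has AUC 0.5 under this definition, whatever that probability is.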

Key Facebook Features. To better understand the individual fac- 
tors contributing to the length of a comment thread, we perform 
stepwise forward feature selection. In iteration j of this algorithm,
we create working feature set $F_j$ by finding the best single feature
to add to the set $F_{j-1}$ to maximize our objective function, area under
the ROC curve (AUC). Because it only selects a single feature
at a time, this method prevents us from adding more than a single
copy of highly correlated features, and the order in which the features
are added gives us some insight into the nature of these comment
threads.
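The selection loop just described can be sketched in a few lines of Python. This is an illustrative sketch rather than our experimental code: `score` stands for any routine that trains a classifier on a feature subset and returns its AUC, and the toy scoring function, feature names, and numbers below are invented solely to show how a near-duplicate feature ends up late in the ordering.

```python
def forward_select(features, score, max_features=None):
    """Greedy stepwise forward selection.

    `features` is a list of feature names and `score(subset)` returns the
    objective (for example, held-out AUC) for a list of names. Returns a
    list of (feature added, score after adding it), in selection order.
    """
    selected, history = [], []
    remaining = list(features)
    limit = len(remaining) if max_features is None else max_features
    while remaining and len(selected) < limit:
        # Pick the single feature whose addition maximizes the objective.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
        history.append((best, score(selected)))
    return history


def toy_score(subset):
    """Made-up objective: two of the features are nearly redundant."""
    gains = {"f_time": 0.20, "f_time_dup": 0.19, "f_other": 0.05}
    s = 0.5 + sum(gains[f] for f in subset)
    if "f_time" in subset and "f_time_dup" in subset:
        s -= 0.18  # correlated features add little when used together
    return round(s, 3)


print(forward_select(["f_time_dup", "f_other", "f_time"], toy_score))
# [('f_time', 0.7), ('f_other', 0.75), ('f_time_dup', 0.76)]
```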

Table 4 shows the features selected by this process for the Facebook
dataset. There are three things worth noting from these results.
The first is that a relatively small set of features contributes
almost all of the predictive value. In particular, the amount of time
it takes for the first five comments to arrive (the TIME:time[5] feature)
is highly indicative of whether or not the thread will eventually
reach 8 comments. Second, most of the key features come from
the fifth comment. Thus, when predicting whether or not the thread
will continue, one should focus on the most recent activity of the
thread. We highlight this in Figure 6, where we show the prediction
performance when using only the subsets of features derived
from a single message in the thread, ranging from the original post
(x = 0) to the fifth comment. The third item of note is the fact that
the link-based features do not have much effect. We believe this
is because they are low-recall, in the sense that we only showed
in §4.1 that they are useful when all the early commenters in the
thread are distinct.

Given the strength of the time feature, it is interesting to ask what 
would be the effect of its removal. The combination of the other 
features is unable to make up for the loss of temporal information: 
removing that key feature, the AUC drops from 0.729 to 0.588. 
With or without that feature, and even if we slice the data to predict
only for a fixed TIME:time[5] ∈ [15m, 20m), the relative
ordering of the other features remains more or less unchanged.

Feature added                        AUC
TIME:time[5]                         0.6954
+ ARRIVAL PATTERN:uniq_comm[5]       0.7053
+ MISC:num_words[5]                  0.7138
+ TIME:time[3]                       0.7214
+ MISC:question[^]                   0.7256
+ ARRIVAL PATTERN:id_code[5]         0.7258
+ ARRIVAL PATTERN:uniq_comm[4]       0.7260

Table 4: Results of stepwise forward feature selection on Facebook.
Each row represents performance for all features listed
in that row and above.

This is consonant with De Choudhury et al. [8], who remark that
"textual analyses ... alone are not adequate to capture conversational
interestingness because [they] do not consider the dialogue
structure between users".

Figure 6 (Facebook High-Activity; x-axis: features from comment x
only): Performance when predicting using only the features
derived from a single comment (or original post for x = 0). The
later the comment, the more informative.

Key Wikipedia Features. In the case of Wikipedia, we see a somewhat
similar ordering to the features. Again, the features regarding
the most recent comment (here, the second one) are the most predictive
of future comments. Here, however, we find that the length
of the second comment is the most important feature, followed by
the time to the second comment, and the ID code of the second
and first commenters. Beyond these first four features, the relatively
small size of the dataset makes the predictive power of other
features unclear.



6. PREDICTING THREAD RE-ENTRY 

Here we examine our second, and novel, prediction task: given 
the initial portion of a thread and the identity of one of the com- 
menters, how well can we predict whether that commenter will 
contribute another comment later in the thread? As noted earlier, 
the idea here is to determine whether to keep a user notified of the 
progress on a thread after they have commented on it already — 
there are some threads for which the user might want to actively 
return to the discussion. For space reasons, we can only provide an 
overview of results here, omitting detailed feature analysis. 

For simplicity, we focus on the following two (related) questions.
Recall that we use ID code 0 for the original poster of a thread, and
ID code 1 for the first commenter other than the original poster,
assuming there is such a commenter. (a) Assuming that ID code 1
occurs in the length-5 arrival-pattern prefix, does that user ever appear
again? (The value 5 was used in our thread-length prediction
problem as well.) (b) The same, but for the first 9 comments. We
use the same features as in the previous task; see §5.1 and Table 2.
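The labels for questions (a) and (b) can be derived directly from an arrival pattern. The sketch below is a minimal illustration; the function name and the convention of returning `None` for threads that fall outside the task (ID code 1 absent from the prefix) are assumptions, not a description of our pipeline.

```python
def reentry_label(pattern, prefix_len=5, id_code=1):
    """Re-entry label for one thread.

    Returns None if `id_code` does not occur among the first `prefix_len`
    comments (the thread is not part of the task); otherwise returns True
    if that ID code appears again after the prefix, and False if not.
    """
    prefix, rest = pattern[:prefix_len], pattern[prefix_len:]
    if id_code not in prefix:
        return None
    return id_code in rest


print(reentry_label((1, 0, 1, 2, 3, 4, 1)))  # True
print(reentry_label((1, 2, 3, 4, 5, 6)))     # False
print(reentry_label((0, 0, 0, 0, 0, 1)))     # None
```

Question (b) corresponds to calling the same function with `prefix_len=9`.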

Using cross-validation, we find (Table 5) that the performance
on the full feature set for Facebook is an AUC of 0.855 for the
9-comment version, and 0.808 for the 5-comment version of the task.
Using the same feature selection methodology described above, we 
find that the most important features are the identities of the in- 
dividuals posting the comments (id_code[i]), and especially the 
identities of the most recent few commenters. The time between 
the two most recent comments also plays an important role, as the 
longer it takes, the slower the conversation is moving, and the more 
likely it is to come to an end. 







                                                   AUC (x-val)
FB (after 5 comments)      Pos.-% bias baseline    .500
                           Text baseline           .520
                           Our features            .808
FB (after 9 comments)      Pos.-% bias baseline    .500
                           Text baseline           .525
                           Our features            .855
Wiki (after 5 comments)    Pos.-% bias baseline    .500
                           Text baseline           .494
                           Our features            .644

Table 5: Main thread-re-entry prediction cross-validation results.
Bold marks the best performance per dataset.

7. MODELING THREAD RE-ENTRY 

Having gained some empirical understanding of thread re-entry, 
including relatively good performance at predicting it, we now seek 
to develop further theoretical understanding of re-entry by formu- 
lating a set of probabilistic generative models that produce arrival 
patterns of a given fixed length. We then study which of these mod- 
els produce the qualitative phenomena we observe in real threads 
— particularly bimodality in the number of distinct commenters. 

The first class of models $\mathcal{F}$ we consider has the following basic
structure for choosing who makes the $j$-th comment, reminiscent of
the Chinese Restaurant Process [1]. With some fixed probability
$p_j > 0$, we introduce a new participant; with probability $1 - p_j$ we
select, according to some underlying probabilistic rule, a participant
who has already appeared in the thread. (We refer to such participants
as re-entrants.) The re-entrant selection rule is assumed
to be a randomized algorithm $\theta$ that takes a thread prefix as input
and produces the name of an existing participant in the thread. This
is a very general definition; depending on the choice of the function
$\theta$, we can define arbitrary rules that, for example, pick a
re-entrant uniformly, or according to "rich-get-richer" principles that
favor people who have commented more in the past [15], or according
to recency principles so that an individual's selection probability
decreases in the time since they last commented. Each model
$Q(k, \theta, p)$ in this class $\mathcal{F}$ is described by a thread length $k$,
selection rule $\theta$, and sequence of probabilities $p = p_1, p_2, p_3, \ldots, p_k$,
with $p_j \in (0, 1]$.
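A member of this class can be simulated with a short generic routine. The sketch below is illustrative only: the function names are hypothetical, the selection rule is passed in as a callable, and the uniform rule shown is just one of the possibilities mentioned above. It assumes that the original poster (ID code 0) counts as an existing participant from the start.

```python
import random


def generate_thread(k, p, select_reentrant, rng=random):
    """Generate a length-k arrival pattern.

    `p[j]` is the probability that the (j+1)-th comment comes from a new
    participant; otherwise `select_reentrant(prefix, rng)` must return the
    ID code of a participant already in the thread.
    """
    pattern, next_id = [], 1
    for j in range(k):
        if rng.random() < p[j]:
            pattern.append(next_id)  # a new participant arrives
            next_id += 1
        else:
            pattern.append(select_reentrant(tuple(pattern), rng))
    return tuple(pattern)


def uniform_rule(prefix, rng):
    """One possible selection rule: uniform over existing participants."""
    existing = sorted(set(prefix) | {0})  # the poster is always present
    return rng.choice(existing)


rng = random.Random(0)
print(generate_thread(4, [1.0] * 4, uniform_rule, rng))  # (1, 2, 3, 4)
```

With every probability equal to 1 the output is the fully expansionary pattern; smaller values mix new arrivals with re-entries chosen by the rule.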

A Negative Result. Although $\mathcal{F}$ initially seems reasonable and
covers a large space, it turns out to be a poor fit to reality, because
none of its members can yield the expansionary vs. focused
bimodality that we found empirically.

Theorem 1. Let $Q(k, \theta, p)$ be an arbitrary model in the class
$\mathcal{F}$, and let $X$ be a random variable equal to the number of
distinct participants in a length-$k$ thread $t$ generated by $Q(k, \theta, p)$
(counting the initial poster). Then $X$ has a unimodal density function:
there is a number $d^*$ such that $\Pr[X = d]$ is monotonically
increasing for $d < d^*$ and monotonically decreasing for $d > d^*$.

We omit the proof due to lack of space, but it consists essentially 
of projecting the arrival pattern onto a binary sequence that records 
only whether each participant is a re-entrant or not. 
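One way to read the projection argument is that, whatever the selection rule, the number of distinct participants equals one (the poster) plus a sum of independent indicator variables, one per step, each recording whether a new participant was introduced. Sums of independent Bernoulli variables are known to have unimodal distributions. The sketch below, which is an illustration and not the omitted proof, computes this distribution exactly by dynamic programming and checks its shape numerically; the function names and the example probabilities are made up.

```python
def distinct_participant_pmf(p):
    """Exact distribution of X = 1 + sum_j B_j with B_j ~ Bernoulli(p[j]).

    Returns a dict mapping d to Pr[X = d]; the leading 1 counts the
    original poster.
    """
    pmf = [1.0]  # pmf[s] = Pr[s new participants so far]
    for pj in p:
        nxt = [0.0] * (len(pmf) + 1)
        for s, mass in enumerate(pmf):
            nxt[s] += mass * (1 - pj)   # step is a re-entry
            nxt[s + 1] += mass * pj     # step introduces a new participant
        pmf = nxt
    return {s + 1: mass for s, mass in enumerate(pmf)}


def is_unimodal(values, tol=1e-12):
    """True if the sequence rises (weakly) to a peak and then falls."""
    peak = values.index(max(values))
    rising = all(a <= b + tol for a, b in zip(values[:peak], values[1:peak + 1]))
    falling = all(a + tol >= b for a, b in zip(values[peak:], values[peak + 1:]))
    return rising and falling


pmf = distinct_participant_pmf([0.9, 0.1, 0.9, 0.1, 0.5])
print(is_unimodal(list(pmf.values())))  # True
```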

Models Exhibiting Bimodality. In view of this negative result, 
we seek an alternate class of models capable of generating arrival 
patterns that exhibit bimodality in the number of distinct commenters. 

Arguably the simplest approach is to consider mixture models 
that have bimodality "built in": We need only suppose that there 



are two distinct types of posts, one which concentrates the num- 
ber of distinct participants on a small value, and the other which 
concentrates it on a large value, and that threads are constructed by 
drawing one of the first type with fixed probability $\pi > 0$ or one of
the second with probability $1 - \pi$.

While this mixture principle is presumably an important reason 
why we see bimodality in the real data, it is not the whole story. In- 
deed, we ran the following experiment to see whether the same type 
of post can lead both to focused and expansionary threads. As it 
turns out, the CNN link that was most shared among a large sample 
of Facebook users in the first quarter of 2012 was a report of Whitney
Houston's death. Although the set of threads spawned just by
shares of this link is small by the standards of Figure 1, it is large in
an absolute sense, and we observed in this controlled-content case 
the same sort of bimodality exhibited by threads overall: some- 
times, the news provoked a series of "drive-by" comments when 
it was shared by a user, and other times, the same news prompted 
extended small-group discussion. 

This finding motivates us to construct models of arrival patterns 
that produce the expansion/focus bimodality as a byproduct with- 
out assuming post type as its cause. To do this, we posit a type of 
internal symmetry-breaking during thread generation, taking inspiration
from the theory of nonlinear urn processes [2]. In this new
class of models, the probability that a new participant enters at step 
j depends on the identities of the participants in the first j — 1 steps. 
Intuitively, when there are many distinct participants, the process 
should make re-entry less likely, thereby producing momentum in 
the expansionary direction; when a few participants have each in- 
teracted multiple times, the process should make it harder for new 
participants to break in, thereby building up momentum in the con- 
versational direction. 

The class is parametrized by $\alpha > 1$ and $\beta > 0$. For each
participant $c$ already in the thread (including the original poster, 0),
and each length $j < k$, each such existing participant will have a
weight $w_j(c)$ after step $j$ of the thread that controls their probability
of providing the next comment. The fixed weight $\beta > 0$ controls
the probability that a new participant arrives in the next step. We
also impose the constraint that the same person never appears twice
in a row.

Generating the arrival pattern $\gamma = \gamma_1 \cdots \gamma_k$ proceeds as follows.
The first commenter will be labeled 1 (since we do not have the
poster, labeled 0, provide the first comment too); so we initialize
by setting $\gamma_1 = 1$, $w_1(0) = w_1(1) = 1$, and following this
initialization we are positioned to determine the author of the second
comment. In general, consider an arbitrary step $j < k$, and let $c_j$
be the commenter in that step. We proceed as follows.

(i) Choose commenter $j + 1$. We choose a participant (different
from $c_j$) with probability proportional to the weights. Specifically:
pre-existing participant $c \neq c_j$ is chosen with probability
$w_j(c) / (\beta + \sum_{c' \neq c_j} w_j(c'))$, and a new participant is
introduced into the thread with probability
$\beta / (\beta + \sum_{c' \neq c_j} w_j(c'))$.
We use $c_{j+1}$ to denote the participant chosen for step $j + 1$.

(ii) Update weights. If the participant $c_{j+1}$ in step $j + 1$ is a
re-entrant, we define $w_{j+1}(c_{j+1}) = \alpha w_j(c_{j+1})$, and leave all
other weights unchanged. If instead $c_{j+1}$ is new, we define
$w_{j+1}(c_{j+1}) = 1$ and for all other pre-existing participants
$c \neq c_{j+1}$ we reduce their weights by setting
$w_{j+1}(c) = w_j(c)/\alpha$.
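Steps (i) and (ii) translate into a compact simulation. The following Python sketch is one possible implementation under the definitions above, not the code used for our figures; the function names, the seeded random generator, and the choice to count the original poster among the distinct participants are assumptions.

```python
import random
from collections import Counter


def simulate_thread(k, alpha, beta, rng=random):
    """Generate one length-k arrival pattern from the weighted process."""
    weights = {0: 1.0, 1: 1.0}  # initial weights of the poster and commenter 1
    pattern = [1]               # the first commenter is labeled 1
    while len(pattern) < k:
        current = pattern[-1]
        # Step (i): anyone except the most recent commenter may be chosen.
        candidates = [c for c in weights if c != current]
        masses = [weights[c] for c in candidates]
        r = rng.random() * (beta + sum(masses))
        chosen = None
        for c, m in zip(candidates, masses):
            if r < m:
                chosen = c
                break
            r -= m
        # Step (ii): update the weights.
        if chosen is None:
            # The remaining mass beta was hit: a new participant arrives
            # and every existing weight is divided by alpha.
            for c in weights:
                weights[c] /= alpha
            chosen = len(weights)
            weights[chosen] = 1.0
        else:
            weights[chosen] *= alpha  # a re-entrant's weight is boosted
        pattern.append(chosen)
    return pattern


def distinct_counts(trials, k=40, alpha=2.0, beta=1.0, seed=0):
    """Histogram of the number of distinct participants (poster included)."""
    rng = random.Random(seed)
    return Counter(len(set(simulate_thread(k, alpha, beta, rng)) | {0})
                   for _ in range(trials))
```

Calling `distinct_counts` with a large number of trials for several values of `alpha` yields empirical histograms that can be compared across parameter settings.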



This is essentially without loss of generality, since on the real 
threads we can also build a comparable representation where we 
collapse out consecutive occurrences of the same participant. 




Figure 7: Density function of distinct participants in threads
produced by our proposed family of processes with $\alpha = 1.5$, 2,
and 4 ($\beta = 1$ and $k = 40$).

The key point is the weight update rule in part (ii). A new arrival
suppresses the weight of all existing participants, making it less 
likely they will comment again and paving the way for further new 
arrivals. On the other hand, when an existing participant provides 
the next comment, their increase in weight makes it more likely 
they will return, thereby promoting back-and-forth interaction. 

We show via simulation that bimodality emerges naturally in this 
model. To paraphrase Langston Hughes, the number of distinct
participants can dry up like a raisin in the sun, or it can explode. Figure 7
shows the empirical density function obtained through simulation
for the number of distinct participants under multiple settings of the
model parameters: we fix the length $k = 40$ and $\beta = 1$, and then
we simulate the process with $\alpha = 1.5$, 2, and 4. As we see there,
bimodality emerges as $\alpha$ increases, which accords with intuition:
larger values of $\alpha$ are more aggressive in amplifying both the
focused and expansionary effects, and hence serve to bifurcate the
process into its two modes more strongly.

The model appears to be quite challenging to analyze rigorously, 
and it is an interesting open question to prove that it produces bi- 
modality, as well as to characterize the transition from unimodality 
to bimodality as we increase $\alpha$. The model shares some properties
with nonlinear urn processes [2], but also has ingredients that lie
beyond what is usually needed for the analysis of such processes. 

8. RELATED WORK 

To our knowledge, there has not been prior consideration in the 
literature of the overall problem of algorithmic conversation cura- 
tion — an emerging key component in enhancing user experience 
in current forms of on-line social interaction. This problem in- 
volves many issues, including those investigated in this paper: (a) 
determining which posts are interesting enough to bring to a user's 
attention; (b) among discussions a user already has knowledge of, 
choosing which the user should continue to be updated about; and, 
indirectly, (c) understanding the structure of discussions, both to 
aid in the two issues just described and potentially for implications 
in user-interface design. Of course, there has been much valuable 
work on the first and last issue individually, which we now describe. 
(Our attention to (b) appears to be novel.) On (a), we point out De
Choudhury et al.'s research [8] on the interestingness of YouTube
comment threads, as measured by interestingness of topic and par- 
ticipants (not length), and Shmueli et al.'s work fSl on predicting 
which stories a particular user is most likely to comment on. Prior 
work on comment- volume prediction 1 3 , 14 23 , 25 , 26 1 is of course 
also quite relevant. How fast a piece of information spreads or dif- 
fuses (3ll4l ll6lll8ll21l is another important aspect of interesting- 
ness. Quality of posts or comments, as determined by ratings, is 
potentially also relevant; see for example Siersdorfer et al. [22].
On (c), there is intriguing work fl2' '151 on structural characteri- 
zations of discussions when viewed as trees (not an approach we 
Figure 8: Considering only 8+-word posts that generated responses,
for Facebook, the more distinctive the text of the original
post, the more comments it garners, but for Wikipedia,
which is more task-oriented, there is no such effect.

have taken in this paper, and arguably less natural as a model for 
discussions on sites like Facebook that have post-and-comment- 
interfaces). Different perspectives are taken by researchers look- 
ing at characterizations of agreement and/or sentiment among com- 
ments l:5l llll[T9ll20l . and by sociological analyses of turn-taking 
in group conversations flOl . 



9. FUTURE WORK: DISTINCTIVENESS 

We see many exciting future directions to pursue. Here, we 
briefly highlight two preliminary explorations into distinctiveness 
features that we believe hold promise, although we have not yet 
identified a way of applying them in their current form to improve 
prediction performance. The main idea is that the likelihood of a 
post's text or early commenters should be informative. 

Distinctiveness of Text. A basic property of a piece of text is its 
likelihood — whether its word choices look typical when compared 
to a reference collection, or whether its word choices are less likely 
and hence more distinctive. In recent work, measures of distinc- 
tiveness were shown to help in recognizing movie quotes that were 
deemed "memorable" in the sense of cultural penetration [6]. In
our case, the question is the following: When a post contains un- 
usual text, what should this lead us to estimate about the length 
of the resulting comment thread? There are intuitive arguments in 
both directions: some low-probability posts might generate discus- 
sion because they are provocative and unexpected, but others might 
simply be hard to understand and thus be mainly ignored. 

We built a unigram language model from 3.5 million Facebook
posts by authors whose posts weren't in our main dataset; for each
word $w$, the model provides a probability $p(w)$. We define a post's
distinctiveness to be the average over its tokens $w$ of $\log(1/p(w))$;
lower distinctiveness means a more likely post. Figure 8 shows
the macro-averaged post length as a function of text distinctiveness,
considering only posts containing at least 8 words and that
received at least one comment in the case of Facebook or at least 
two for Wikipedia. For these particular subsets, our two Facebook 
populations exhibit a clear positive effect of the distinctiveness of 
the text, whereas for Wikipedia there seems to be no effect at all 
(which perhaps stems from the task-oriented nature of Wikipedian 
discussions). We note, however, that the effects become less clear 
if we include posts that turned out to generate no comments (or at 
most one on Wikipedia). 
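The distinctiveness score itself is simple to compute once a unigram model is available. The sketch below is a minimal illustration with a tiny made-up reference collection; the add-one smoothing (with a single extra slot for unseen words) is an assumption introduced here so that $p(w)$ is never zero, and is not a description of how our model handled unseen words.

```python
import math
from collections import Counter


def train_unigram(reference_posts):
    """Return a function p(w) estimated from tokenized reference posts.

    Add-one smoothing with one extra slot for all unseen words is an
    assumption made for this sketch.
    """
    counts = Counter(w for post in reference_posts for w in post)
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 for the unseen-word slot
    return lambda w: (counts[w] + 1) / (total + vocab)


def distinctiveness(tokens, p):
    """Average of log(1/p(w)) over the tokens of a post."""
    return sum(math.log(1 / p(w)) for w in tokens) / len(tokens)


# Toy reference collection (made up for illustration).
p = train_unigram([["happy", "birthday"], ["happy", "day"]])
print(distinctiveness(["happy", "day"], p) < distinctiveness(["zebra", "day"], p))  # True
```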



At 8 words and beyond, post distinctiveness becomes empirically
almost independent of post length, disentangling the two features. 





First Commenter's Distinctiveness as First to Comment 

Figure 9: In Facebook, the more distinctive the first commenter 
is (in terms of not often being first to respond to the original 
poster), the longer the thread. Wikipedia is not depicted due to 
sparseness, but the overall trend is the opposite. 



Distinctiveness of First Commenter. For a user u who posts reg- 
ularly, a set of frequent commenters on u's threads often emerges 
— the people who generally weigh in when u says something. 
Thus, shifting from the likelihood of words to the likelihood of 
users, it makes sense to ask about the effect on thread length of 
the first commenter's distinctiveness — the extent to which this 
commenter is usually or rarely first in one of u's threads. Again,
there are intuitive arguments each way: if the first commenter v 
is someone who's often a first commenter on u's posts, then v is 
presumably familiar to both u and the audience for u's comments, 
which could make it easier for the thread to grow; but it may also 
be socially easier to let v's comment pass by without much activity. 

In Figure 9 we show for Facebook the expected thread length as a
function of the fraction of times the first commenter was not the first
to respond to the original poster's posts. We see a clear upward
trend: when someone you rarely hear from first is in fact the first to 
comment on your post, on average it foreshadows a longer thread, 
perhaps because this indicates that the post has greater reach. 
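The quantity on the horizontal axis can be computed per poster from the list of first commenters on that poster's threads. The sketch below is one plausible reading of the definition; the function name is hypothetical, and whether the thread under consideration is itself included in the history is left to the caller.

```python
from collections import Counter


def first_commenter_distinctiveness(first_commenters, v):
    """Fraction of a poster's threads whose first commenter was not v.

    `first_commenters` lists, for a single poster u, the first commenter
    of each of u's threads that received at least one comment.
    """
    counts = Counter(first_commenters)
    return 1 - counts[v] / len(first_commenters)


# Toy history for one poster (made up for illustration).
history = ["bob", "bob", "cat", "bob"]
print(first_commenter_distinctiveness(history, "bob"))  # 0.25
print(first_commenter_distinctiveness(history, "cat"))  # 0.75
```

A commenter who has never been first on this poster's threads receives the maximum value of 1.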

We note that Wikipedia appears to exhibit the opposite behavior. 
We do not depict the Wikipedia results due to sparseness of recur- 
ring first commenters, but when restricting to users with at least 10 
posts and binning the distinctiveness values, we see a significant 
decreasing trend. 

10. CONCLUSIONS 

Motivated by the growing role of automated mechanisms to man- 
age users' interactions with on-line discussions, we have identified 
and studied two key problems in the curation of such discussions. 
The first of these, length prediction, is related to earlier studies of 
comment volume on blog and news sites, but it acquires additional 
complexity in our context due to the heterogeneity we find in long 
threads, which can either be focused on a few participants or ex- 
pand to reach many. The second problem, re-entry prediction, has 
to our knowledge not been formulated previously; it is a crucial 
issue in applications that must decide when to notify users about 
updates to discussions in which they have participated. 

We see these two problems as helping to define the contours of 
the problem of conversational curation more broadly, and as such 
the results here suggest a range of further open questions. Among 
these are a deeper understanding of the features that can help predict
the trajectory of an on-line discussion from its early stages,
and the integration of these techniques into systems that deliver
discussion-oriented content to users in on-line applications.
For the uniform population, the plot only considers users with at
least ten posts, although different threshold values do not greatly
affect the resulting trends.



